Optimizing Nuclear Reaction Analysis (NRA) using Bayesian Experimental Design 
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Nuclear Reaction Analysis with 3 He holds the promise to measure Deuterium depth profiles up to large 
depths. However, the extraction of the depth profile from the measured data is an ill-posed inversion problem. 
Here we demonstrate how Bayesian Experimental Design can be used to optimize the number of measurements 
as well as the measurement energies to maximize the information gain. Comparison of the inversion proper- 
ties of the optimized design with standard settings reveals huge possible gains. Application of the posterior 
sampling method allows to optimize the experimental settings interactively during the measurement process. 



PACS numbers: 02.50.Le, 02.50.Tt, 25..55.-e, 29.85.Fj 



I. INTRODUCTION 

The rising price for oil has recently shifted the focus to other possible sources of energy, preferably without adverse effects to 
the environment. One of the methods presently being developed is nuclear magnetic fusion. The objective of fusion research is 
to harness the energy provided by the fusion of hydrogen isotopes. In the fusion experiment ITER, presently under construction 
in Cadarache, France, the necessary data to design and operate anelectricity-producing plant shall be gained. ITER is a tokamak, 
an intermittent operating device in which strong magnetic fields confine a torus-shaped plasma. Since the confinement is not 
perfect (and must not be) there are always interactions between the plasma and the plasma-facing (wall) components (PFCs) 
which have to be taken into account. One of the key aspects in the licensing process of ITER is a strict upper limit of the 
total amount of radioactive tritium accumulated in the vessel walls, which is presently at 700g tritiumf]]]. The prediction of 
the amount of retained tritium is complicated by the material choice of ITER (Fig|TJ: The main vessel walls are Beryllium, 
the strike-points are made of carbon (CFC) and the other parts of the divertor are tungsten. During the operation of ITER the 
interaction of the plasma and high energy 14MeV-neutrons with the vessel walls will lead to erosion, redeposition, material 
mixing and alloy formation. Since even the hydrogen retention properties of pure materials are still subject to current research, 
a significant amount of additional experimental data is required to develop and calibrate the theoretical models which will be 
needed to process the huge number of material combinations created in ITER. 

However, even the first step - measuring hydrogen depth profiles in material composites - is challenging for many reasons; 
here we will mention only two: a) Hydrogen and its isotopes are very volatile, which can easily distort measurements of depth 
profiles and b) Hydrogen is usually the main component of the residual gas in vacuum chambers which precludes the use of 
many well-established analysis methods. 

One method which holds great promise to overcome these difficulties is the Nuclear Reaction Analysis (NRA) of deuterium 
using 3 He as probing particle. It is a specific and sensitive method, and has a sufficient analysis depth. However every data point 
takes about 30min to measure and the extraction of the concentration depth profile is an ill-posedinversion problem requiring 
the deconvolution of the measured data vector, here even more challenging than in Rutherford Backscattering]2|]. Therefore the 
experimental setup (ie the choice of the analysis energies) should provide a maximum of information. So far the most common 
choice of the 3 He energies for the measurements was simply equidistant. Using Bayesian Experimental Design the performance 




FIG. 1: In-vessel view of ITER. The surface of the main chamber is Beryllium, the material of the divertor (in lower part of the vacuum vessel) 
is carbon and tungsten. Source:[l ], published with permission of ITER. 
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of the method can be improved considerably (in some cases up to orders of magnitude) and quantitative measures can bederived 
about the expected utility of further (time consuming) measurements. sectionNuclear Reaction Analysis The basic principle of 
Nuclear Reaction Analysis is straightforward: The sample is subjected to an energetic ion beam (here 3 He) with initial energy 
Ef and incoming angle 0, which reacts predominately with the species of interest (Deuterium) and the products of the reaction 
are measured under a specified angle 9. Given the total number of impinging ions N,-, the energy dependent cross-section of the 
reaction a (E), the efficency of the detection and the geometry of the set-up jj. the measured total signal counts dj depend (in the 
limit of small concentrations) linearly on the concentration profile c (x) of the species in the depth x: 

r x ( E ?) r x ( E ?) 
di = d(Ef) =itSiJ dxo{E)c(x)=nNiJ dxo (E (x,E?))c(x)+Ei, (1) 

where £, ~ N (0, <7,-) represents normal distributed noise. Repeated measurements with different initial energies of 3 He provide 
increasing information about the Deuterium depth profile. The question addressed in the following is: Given a set of already 
measured data d (E?) which measurement energy should be chosen next? 

To evaluate Eq. [Tjwe first need to specify the cross section cr(.E') and the energy E(x) of the incident particle on its path 
through the sample. 



A. Cross-Section 

The relevant cross-section for the reaction D+ 3 He — * p + 4 He+ 18.352 MeV (in standard notation written as D ( 3 He,/?) 4 He) 
has been (re-)measured recently |3[] in the range of 550 keV to 6MeV and the obtained cross-section values have been given in 
tabular form. Using the same method as [3] we added several cross-section measurements at energies below 690keV and fitted 
both data sets taking into account also earlier measurements J3, Ht] 

, , „ £2.83962 ( 270713 * e - 2 - 2l5SE + 0.0182765) , , 

a (E [MeV]) = 829.98 * ^ 3i47626 + q 270713 * e -i.i7229£ _ 0,00123669 ^ " (2) 

using the Levenberg-Marquardt algorithm minimizing the % 2 -misfit with the variance set to d t . The cross-section is plotted in 
Fig|2k and shows a broad maximum around 630 keV and is above 3 MeV nearly constant at 8 mb/sr up to 6 MeV (above that 
there are no data available). The reaction energy is very high (Q=18.352 MeV) and most of the energy is transferred to the 
resulting proton. This leads to a very good S/N-ratio of the measurement because other particles can easily be separated by 
energy. 



B. Energy Loss 

The energy loss of the impinging 3 He-ion in the sample is determined by the stopping power S(E) of the sample 

d -^ = S{E), (3) 
ax 

which can be solved to get the depth dependent energy Ej(x) for different initial energies E®. Parameterizations and tables of S 
for different elements are given in [7]. Since the amount of hydrogen in the sample is usually well below 1% (with the exception 
of a very thin surface layer), the influence of the hydrogen concentration on the stopping power can be neglected in most cases. 



C. Simulation of Mock Data 

To simulate mock data for typical accelerator parameters a tungsten sample = 19.3g/cm 3 ^ with a (high) surface con- 
centration of 12% Deuterium, followed by an exponentially decaying Deuterium concentration down to a constant background 
level, described by 

c(x) =a *exp + a 2 = 0.1exp 5» iqi^at 1 +0 ' 02 ^ 

has been used JH. The corresponding mock data for a set of initial energies E°={500, 700, 1000, 1300, 1600, 2000, 2500, 
3000}keV is shown in Fig. 2b. The variations in the detected yields display the interplay of the increasing range of the ions 
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FIG. 2: a) Differential cross-section of the nuclear reaction D( 3 He,p) 4 He in the laboratory system with a reaction energy of Q=18.352 MeV 
(left), b) On the right hand side the typical results for an NRA measurement are shown (simulated data of a tungsten sample with an exponen- 
tially decaying D concentration profile (cf. EqQ. The uncertainties due to the counting statistics are usually dominated by the uncertainty of 
the analysis current. 



with increasing energy and the reduced cross-section at higher energies modulated with the decreasing Deuterium concentration 
at larger depths. The increase of the signal by raising the initial energy from 2500 keV to 3000 keV is already caused by the 
constant Deuterium background of 2%. The accelerator time which would be needed to obtain the 8 data points is around one 
working day taking into account the necessary interleaved calibration measurements:The ion bombardment causes an energy 
and depth dependent loss of Deuterium. Commonly a first order correction is applied by normalizing the yields with respect to 
the yields obtained from repeated calibration measurements using the same (typically low, e.g. 690keV) initial energy|23]. 

The uncertainty of the detector is given by Poisson-statistics. However, fluctuations in the beam current measurements are 
very often the dominating factor, affecting the pre-factor N, in EqQ] An accuracy of up to 3% can be achieved (e.g. by 
using the number of Rutherford-scattered 3 He ions on a thin gold-coating on top of the sample as reference). The error of the 
renormalization procedure is harder to quantify. For simplicity we will use <7, = max (5%di, y/di) as uncertainty of the data in 
the following, acknowledging that there is room forimprovement. 



II. BAYESIAN EXPERIMENTAL DESIGN 



Bayesian Experimental Design (BED) offers the tempting possibility to actively select (and optimize) the experimental pa- 
rameters for the next measurement(s) based on objective criteria. Especially if measurements are expensive or time consuming 
(like in the case of energy changes of an accelerator) it is a huge advantage to know where to look next, so as to learn as much 
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as possible. The problem of experimental design has already been studied by Lindley back in 1956 [il^] in a Bayesian setting 
and Fedorov published his influential book in 1972 ll ill - but the limitations in computational power limited the application of 
experimental design almost always to simple (linear) problems. This situation changed in the recent years and consequently 
there is a renewed interest to apply BED also to (non-linear) real-world problems (see e.g. [12] and references therein or e.g. 

11611 . Not surprising the interest is biggest in branches of physics where the experimental possibilities are severely 
restricted: Astronomy, Fusion research,... Given the excellent account of BED in lfl2ll we only summarize the key principles: In 
a first step an appropriate utility function U has to be agreed upon: It describes the value which we assign to the measurement 
results of an experiment and may include parameters like costs of an experiment, duration, parameter uncertainty etc. Several 
utility functions are considered in fl7il . With focus on parameter estimation it was proposed ifiol, [l8tl to use the Kullback-Leibler 
divergence (KLd) between the posterior and the prior distributions as utility function. The KLd for a new datum D is given by 



U K L(D,d,'n) = J da p{a\D,d, t])\og 



p{g\D,d,r}) 



p(QL\d,7]) 



(5) 



Next we try to identify the action rj which maximizes the expected utility. 'Expected' utility because we have to account for the 
prediction uncertainty for D. To compute the expected utility (EU) we have to average over the new datum D weighted by the 
marginal likelihood for the new datum given the observation of the old data d 



EU(d,rj) = ldDp(D\d,7])-U K L(D,d,r]) 

dD p(D\d,rj) da p(a\D,d, rj) log 



p(a\D,d,r\) 



p{a\d,r\) 



I 
h 



r,, , I j p(D\a,d,ri)p(a\d,ri) 
dDp(D\d,rj) da ~ log 



p(D\i,n) 

= j dD Jdap(D\a,d,ri)p(a\d,ri)\og 
dD da p(D\a,d,rj) p(a\d, )log 



p(D\a,d,ri)p(a\d,ri) 



p(a\d,7])p(D\d,T]) 



p{D\a,d^) 
p{D\d,T\) 
p{D\a,d,rj) 



Jda p (D\a,d,rj) p (a\d) 



(6) 



where we dropped the T]— dependence of the posterior of a in the last line, since our knowledge about a is not influenced by a 
possible future action. Closer inspection of Eq. |6]reveals that only two different probability distributions are required to compute 
the expected utility: the posterior distribution of a given the old data d, p (a\d) and the likelihood of the new datum D based on 
the previous measurements, p (D\a,d_,rj). 



A. The Linear Design 



Assuming that the concentration profile c(x) depends linearly on the concentrations c< (xij , i = 1 ..q at a given set of q support 
points x then EqQ]can be recast in the following form 

d = l+£=Kc + £, (7) 

where the data vector d_ is of size p, the matrix M is a pxq— matrix and the parameter-vector c has q components. However, the 
requirement of linearity applies only to the concentration parameter vector c, the functional form of the concentration may be 
much more complex, e.g. c(x) — c\ * (x — xj,) 4 + C2 * y/\x — x\ |, although almost always c(x) is chosen to be constant between 
the different support points: c(x) = Cj,Vx G [x,,x, + i] or as linear interpolation between the support points. The noise vector e is 
normally distributed e ~ jVfO. E -1 ), where E is a diagonal matrix with the entries Z.. = 1 /of. Every row of M. is given by the 

solution of Eq. [T]for a specified initial energy E®, m ■ The consideration of the uncertainties in the entries of the matrix 

due to energy straggling of the impinging particles is beyond the scope of the present paper, but see e.g. lfl9tl . 

With a Gaussian likelihood for the existing data and a flat prior for the parameters the posterior of the concentration vector 
reads 



p{c\d,T])°<p(d\c,7]) = 
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Energy /keV 

FIG. 3: Expected Utility for subsequent measurements. All measurements are performed with the initial energy E suggested by the maximum 
of the EU in the corresponding cycle ( indicated by marks on the energy axis). 



with 

4 = M r £M and co^A^M^Ld. (9) 

The posterior distribution of c including the new data point D with its uncertainty a, p (c\D,d_,r)) can similarly be cast in a 
Gaussian form. Therefore Eq. [6] can be solved analytically Hill and yields a simple closed form for the exponential utility 
GIB: 

EU(d,T/) = i(log(l + G)-r) (10) 



with 



m(ri) A l m(ri) 

G= = — . (11) 



If p (D\c, T]) is Gaussian then r = 0. The variation of the EU depends on the vector m (rj) which in turn is uniquely determined 
by the choice of the initial energy E® +1 . The optimum (maximum of the EU) can be found by a simple 1-D scan of the energy. 

The sequential design approach in action is displayed in Fig [3] Starting from the surface the concentration at increasingly 
larger depth intervals is of interest. For this example the chosen depths are nm, 80 nm, 240 nm, 470 nm and 950 nm. After 
initial measurements at 400 keV, 700 keV and 3000 keV (representing the lower and upper limit of the useful energy range for 
the measurements and one calibration measurement) the best energy for the next measurement has to be determined. The EU for 
this first cycle has a maximum at 1250keV (solid line). After a measurement with this energy the EU for the next measurement 
has its maximum at 960keV and about twice the EU than before. This, on the first glance, surprising increase of the EU can 
be made transparent: With 5 unknowns and 5 (informative) measurements the solution space of this linear problem no longer 
covers a sub-manifold of the parameter space: It 'collapses' and the volume of the 'occupied' parameter space starts to be 
determined by the measurement uncertainties. Therefore the 5-th measurement has a very high EU. In the following cycle(s) 
the amplitude of the EU is much lower since the subsequent measurements now gradually shrink the 'volume' of the parameter 
posterior distribution. As long as the EU is above the intended threshold for new measurements (which depends on the addressed 
physical problem) further measurements are indicated. 

How much better is the BED derived experiment compared to an experiment with the same number but equidistant chosen 
initial energies? The entropy of the parameter posterior distribution would be the obvious quantity to compare. However, for the 
time being, many scientists are not happy with this measure and prefer a more familiar measure, e.g. the condition number. The 
condition number of the (pseudo-)inverse of M is often used to characterize linear least squares problems B2011 and is a measure 
how strongly uncertainties in the data vector d may be amplified by multiplication with the pseudo-inverse matrix. Using this 
measure the BED optimized setting surpasses the equidistant experiment by a factor of more than 100 (!). 



B. Non-linear Design 



The analytical solution in the preceeding case was possible because several approximations have been applied: The integration 
range of the integration over the predicted datum (a positive quantity) had to be changed from J^dD to J^dD. Given the actual 
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number of counts and the uncertainties this can easily be justified. Unfortunately, a similar change of the integration limits had 
to be applied also in the parameter integration (from J Q dc to J^"dc) and here it definitely affects the results. The analysis could 
be repeated substituting the analytical integration by the numerical counterparts (e.g. using codes like VEGAS [ 20] or MCMC 
approaches). Furthermore, the uncertainty of the predicted datum D is not constant but proportional to the signal Od « D and 
therefore also the integrations over the data space have to be done numerically. Under those circumstances there is no difference 
in the computation to a non-linear experimental design problem. Additionally it turned out that the actual quantity of interest is 
the decay length of the hydrogen depth profile and that quite accurate data for the surface hydrogen concentration are available 
(additionally measuring the 4 He of the D ( 3 He,p) 4 He reaction). Therefore the optimal energy settings for the estimation of the 
parameters a\ and a 2 of concentration profiles of the functional form of Eq. [4] have to be computed. However, in non-linear 
experimental design the measured data influences the EU (in contrast to the linear case: the maximum of the EU is independent 
of the actually measured data, cf. Eq. ITOb and this poses a practical problem: The next accelerator energy has to be determined 
after the previous measurement. And longer computation times to optimize the EU, causing delays, are not tolerable. 

Here the posterior sampling approach, suggested in 11211 . proved very valuable. It turned out that sets of posterior samples 
p (a\i,ci2i\d,ri) ,i= 1..N drawn from p (a\ ,ci2\d, Tj) could be generated quite efficiently (partly due to the low dimensionality of 
the parameter space). With that sample (typically of size 1000) the denominator of the logarithm in Eq. |6]is given by a simple 
summation 

f N 

/ dap (D\a,d,T])p(cc\d) « (D\c^,d,r]) . (12) 

J 1=1 

The biggest saving comes from the fact that the posterior sample is independent from the actual value of D and of the design 
action tj : all computations are reduced to repeated evaluations of the likelihood, which can efficiently be vectorized. Finding the 
best energy is a matter of less than 5 minutes(!) on contemporary hardware (Linux-PC, 2GHz). 

In Fig. 2] three cycles of the non-linear BED are shown: After a first measurement at 500keV the posterior distribution of 
{01,02} is visualized in the upper left graph by the posterior sample. The single measurement does not allow to distinguish 
between a large decay constant ay and low constant offset ai or vice versa. The EU, plotted in the upper right graph, favors now 
a measurement at the other end of the energy range (the maximum of the utility function is encircled). After a measurement with 
3MeV 3 He the ' area' of the posterior distribution is significantly reduced (middle row, left graph): The background concentration 
is below 3% but the decay length is still quite undetermined. The EU has a maximum at 1500 keV, still with a pretty high 
EU. Performing a measurement with 1500keV localizes the posterior distribution around the true (but unknown value of a\ = 
395nm and «2 = 0.02). The next measurement should be performed at 1200keV but the EU is significantly lower than before: 
subsequent measurements are predominantly improving the statistics: a second measurement at 3 MeV provides nearly the same 
information. 



III. CONCLUSION AND OUTLOOK 



The concept of Bayesian Experimental Design allows to objectively optimize experimental designs. Here we presented two 
different approaches to optimize NRA depth profiling: First in a linear setting, allowing an analytical solution and straightforward 
parametric studies. Second, a time-critical non-linear experimental design problem which could be tackled using posterior 
sampling. Both optimization procedures may considerably increase the accuracy of the derived depth profiles compared to 
the present approach and at the same time reduce the overall measurement time by signaling a diminishing utility of further 
measurements. With the posterior sampling approach many sequential measurements can now be optimized on the fly: This 
opens up the door for a wealth of new applications of BED in the field of ion beam analysis^ as well as in other physical areas 
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FIG. 4: Three cycles of the Experimental Design process: On the left hand side 1000 samples drawn from the posterior distribution 
p{a\,a2\d_,r\) are displayed. On the right hand side the EU is plotted and the maximum is indicated by a circle. The corresponding ab- 
scissa value is the suggested next measurement energy. Performingthat measurement yields the posterior distribution given in the next row. 
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